
Poster: J.B. Nicholson-Owens Date: May 8, 2009 4:14pm
Forum: etree Subject: Download continuation works well as far as I can tell.

I can't reproduce your problem with continuing downloads. It works just fine for me:

-----------------------------------------------
$ \wget --version
GNU Wget 1.11.4 (Red Hat modified)

Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic .
Currently maintained by Micah Cowan .

$ \wget --server-response 'http://www.archive.org/download/BigBuckBunny/big-buck-bunny-NTSC.iso'
--2009-05-08 18:13:50-- http://www.archive.org/download/BigBuckBunny/big-buck-bunny-NTSC.iso
Resolving www.archive.org... 207.241.229.39
Connecting to www.archive.org|207.241.229.39|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Found
Date: Fri, 08 May 2009 23:13:50 GMT
Server: Apache/2.2.11 (Ubuntu) PHP/5.2.3-1ubuntu6 mod_ssl/2.2.11 OpenSSL/0.9.8e mod_wsgi/2.3 Python/2.5.1
X-Powered-By: PHP/5.2.3-1ubuntu6
Location: http://ia360903.us.archive.org/1/items/BigBuckBunny/big-buck-bunny-NTSC.iso
Vary: Accept-Encoding
Content-Length: 0
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Location: http://ia360903.us.archive.org/1/items/BigBuckBunny/big-buck-bunny-NTSC.iso [following]
--2009-05-08 18:13:50-- http://ia360903.us.archive.org/1/items/BigBuckBunny/big-buck-bunny-NTSC.iso
Resolving ia360903.us.archive.org... 207.241.231.151
Connecting to ia360903.us.archive.org|207.241.231.151|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.0 200 OK
Connection: keep-alive
Content-Type: application/x-iso9660-image
Accept-Ranges: bytes
ETag: "1375906711"
Last-Modified: Thu, 29 May 2008 23:29:48 GMT
Content-Length: 8294793216
Date: Fri, 08 May 2009 23:13:50 GMT
Server: lighttpd/1.4.18
Length: 8294793216 (7.7G) [application/x-iso9660-image]
Saving to: `big-buck-bunny-NTSC.iso'

0% [ ] 251,688 104K/s ^C

$ \wget --continue --server-response 'http://www.archive.org/download/BigBuckBunny/big-buck-bunny-NTSC.iso'
--2009-05-08 18:13:58-- http://www.archive.org/download/BigBuckBunny/big-buck-bunny-NTSC.iso
Resolving www.archive.org... 207.241.229.39
Connecting to www.archive.org|207.241.229.39|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Found
Date: Fri, 08 May 2009 23:13:58 GMT
Server: Apache/2.2.11 (Ubuntu) PHP/5.2.3-1ubuntu6 mod_ssl/2.2.11 OpenSSL/0.9.8e mod_wsgi/2.3 Python/2.5.1
X-Powered-By: PHP/5.2.3-1ubuntu6
Location: http://ia360903.us.archive.org/1/items/BigBuckBunny/big-buck-bunny-NTSC.iso
Vary: Accept-Encoding
Content-Length: 0
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Location: http://ia360903.us.archive.org/1/items/BigBuckBunny/big-buck-bunny-NTSC.iso [following]
--2009-05-08 18:13:58-- http://ia360903.us.archive.org/1/items/BigBuckBunny/big-buck-bunny-NTSC.iso
Resolving ia360903.us.archive.org... 207.241.231.151
Connecting to ia360903.us.archive.org|207.241.231.151|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.0 206 Partial Content
Connection: keep-alive
Content-Type: application/x-iso9660-image
Accept-Ranges: bytes
ETag: "1375906711"
Last-Modified: Thu, 29 May 2008 23:29:48 GMT
Content-Range: bytes 286440-8294793215/8294793216
Content-Length: 8294506776
Date: Fri, 08 May 2009 23:13:59 GMT
Server: lighttpd/1.4.18
Length: 8294793216 (7.7G), 8294506776 (7.7G) remaining [application/x-iso9660-image]
Saving to: `big-buck-bunny-NTSC.iso'

0% [ ] 3,227,928 832K/s eta 2h 43m
-----------------------------------------------

wget is a program to download files. As the first command shows, I'm using version 1.11.4 with modifications from Red Hat.

In the second command, I download the NTSC DVD of "Big Buck Bunny". I cancel the download by pressing Control-C.

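In the third command I repeat the download with the "--continue" option, which makes wget resume where it left off. This time the server answers "206 Partial Content" instead of "200 OK" and sends only the remaining bytes, starting at offset 286440. Had I been willing to wait, I could have continued all the way to completion.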
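To make the mechanism concrete, here is a minimal Python sketch of the request a resuming client sends: it looks at how many bytes are already on disk and asks for the rest with a "Range" header. This is an illustration of the idea, not wget's actual code; the function name build_resume_request is made up for this example, and nothing is sent over the network.

```python
import os
import urllib.request


def build_resume_request(url, partial_path):
    """Build an HTTP request that asks the server to skip bytes already on disk.

    Returns (request, offset). If no partial file exists, the request has no
    Range header and the download simply starts from byte 0.
    """
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # "bytes=N-" means: send everything from byte N to the end of the file.
        request.add_header("Range", f"bytes={offset}-")
    return request, offset
```

With a 286440-byte partial file, the request would carry "Range: bytes=286440-", which matches the "Content-Range: bytes 286440-8294793215/8294793216" line in the server's reply above.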

Could it be that the software you're using doesn't handle continuation properly? We can't tell, because you don't say which software you're using, either here or in the thread you direct us to read.


Poster: Albert Schlef Date: May 11, 2009 12:52am
Forum: etree Subject: Re: Download continuation works well as far as I can tell.

Thanks for checking my report, I appreciate it.

I use Linux too, and I also use wget, the exact version you're using.

I see that the URL you used, "http:// .... /big-buck-bunny-NTSC.iso", does indeed support download resuming. But none of the other URLs I try support it.

For example, try to download the following file:

http://www.archive.org/download/Beverly_Hillbillies_Ep01_The_Clampetts_Strike_Oil/BH01_The_Clampetts_Strike_Oil_512kb.mp4

I download it using...

$ wget --continue http://www.archive.org/download/Beverly_Hillbillies_Ep01_The_Clampetts_Strike_Oil/BH01_The_Clampetts_Strike_Oil_512kb.mp4

...then I press Control-C, but when I execute the command again I don't get an "HTTP/1.0 206 Partial Content" like you do. I get a normal "HTTP/1.0 200 OK".
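The difference between the two replies can be checked mechanically. The Python sketch below is only an illustration (the function names are invented for this example): a resume counts as honored when the status is 206 and the "Content-Range" header starts at the offset the client asked for. A plain 200 means the server ignored the "Range" header and is sending the whole file again.

```python
import re

# Matches values such as "bytes 286440-8294793215/8294793216".
_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def parse_content_range(value):
    """Return (first_byte, last_byte, total_size); total_size is None for '*'."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognised Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


def resume_was_honored(status, headers, offset):
    """True only if the server replied 206 and resumed at the requested offset."""
    if status != 206:
        # 200 OK: the body starts at byte 0, so nothing was resumed.
        return False
    first, _last, _total = parse_content_range(headers["Content-Range"])
    return first == offset
```

Fed the headers from the Big Buck Bunny transcript, the range runs from 286440 to 8294793215, which is 8294506776 bytes, the same figure the server reports as Content-Length.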

Perhaps only some of your servers support download resuming? Could you please investigate the URL I gave here?


Poster: J.B. Nicholson-Owens Date: May 11, 2009 7:52am
Forum: etree Subject: Now I can see that download continuation doesn't always work.

Just to be clear, I don't work for archive.org. I'm just a satisfied user.

With regard to continued downloads on http://www.archive.org/download/Beverly_Hillbillies_Ep01_The_Clampetts_Strike_Oil/BH01_The_Clampetts_Strike_Oil_512kb.mp4 I can duplicate your result.

I don't know enough about their lighttpd configuration to be sure why things are the way they are. Apparently continuation is sometimes not available. Perhaps it is turned off on some servers, or for certain MIME types, or for certain files.

As I understand it from glancing at lighttpd docs, one can set

server.range-requests = "disable"

based on almost anything. So one could say:

$HTTP["url"] =~ "\.mp4$" {
    server.range-requests = "disable"
}

and disable range requests for all filenames ending in ".mp4".
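To see which of the URLs in this thread such a rule would catch, here is a small Python sketch that applies the same pattern to the URL paths. It is only an illustration of the hypothetical rule above, not a statement about archive.org's real configuration. lighttpd uses PCRE rather than Python's re module, but for a pattern this simple the two should behave the same.

```python
import re

# Same pattern as in the hypothetical lighttpd condition: path ends in ".mp4".
MP4_URL = re.compile(r"\.mp4$")


def range_requests_disabled(url_path):
    """True if the hypothetical rule would switch off range requests for this path."""
    return MP4_URL.search(url_path) is not None


paths = [
    "/download/BigBuckBunny/big-buck-bunny-NTSC.iso",
    "/download/Beverly_Hillbillies_Ep01_The_Clampetts_Strike_Oil/"
    "BH01_The_Clampetts_Strike_Oil_512kb.mp4",
]
for path in paths:
    print(path, "->", "disabled" if range_requests_disabled(path) else "enabled")
```

Under this assumed rule the .iso would stay resumable and the .mp4 would not, which would be consistent with what we both observed, though other explanations are just as possible.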

I don't know why one would want to turn off continued downloads at all; I'd imagine file download continuation should be on all the time for any kind of file no matter which server the file comes from. I'd imagine this really hampers streaming files as you'd want to be able to pick up where you left off filling a streaming player buffer.

I hope someone from archive.org responds to you and can illuminate the situation. There might be some variables or policy issues of which I am unaware.


Poster: Nemo_bis Date: Nov 3, 2013 11:59am
Forum: etree Subject: Wayback machine doesn't support the "Range" header AKA wget --continue doesn't work

It's still not working fully, in particular on the Wayback Machine: web.archive.org sometimes holds big files, but it stops sending them just short of 100 MiB: "Connection closed at byte 104857347", says wget (100 MiB is 104857600 bytes, so that is 253 bytes short). You can retry at will, and in a few seconds or minutes you get another 100 MiB... but it is the same chunk again.
The wget docs say «Note that -c only works with FTP servers and with HTTP servers that support the "Range" header», and indeed it seems web.archive.org doesn't support it. In my test, the response to the second request is an HTTP/1.1 200 OK, not an HTTP/1.0 206 Partial Content. It makes no difference whether I interrupt the download and resume it manually or let wget retry, and the partial file does exist in the directory.
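Why retrying never helps can be shown with a toy model in Python. It is a deliberate simplification, not a description of how web.archive.org or wget work internally: it assumes the server closes every connection after a fixed number of bytes, and that a server which ignores "Range" always starts again from byte 0. The function name and parameters are invented for this sketch.

```python
def download_with_retries(total, cap, honors_range, max_tries=10):
    """Simulate repeated attempts against a server that cuts each reply at `cap` bytes.

    Returns (bytes_on_disk, tries_used).
    """
    have = 0
    for attempt in range(1, max_tries + 1):
        # A server that honors Range continues at `have`; otherwise it restarts at 0.
        start = have if honors_range else 0
        sent = min(cap, total - start)
        have = start + sent
        if have >= total:
            return have, attempt
    return have, max_tries


# Numbers taken from the wget output in this post.
print(download_with_retries(288092160, 104857347, honors_range=False))
print(download_with_retries(288092160, 104857347, honors_range=True))
```

Under these assumptions the first call stays stuck at 104857347 bytes however many times it retries, while the second finishes the 288092160-byte file on the third attempt.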

I also tried accepting gzip, per a comment I found on the web; see the full output.

$ wget --continue --header "Accept-Encoding: gzip" --tries=0 -S http://web.archive.org/web/20070720040924/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
--2013-11-03 20:02:21-- http://web.archive.org/web/20070720040924/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
Resolving web.archive.org (web.archive.org)... 207.241.224.26
Connecting to web.archive.org (web.archive.org)|207.241.224.26|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Moved Temporarily
Server: Tengine/1.5.1
Date: Sun, 03 Nov 2013 20:02:21 GMT
Content-Type: video/x-msvideo
Transfer-Encoding: chunked
Connection: keep-alive
set-cookie: wayback_server=74; Domain=archive.org; Path=/; Expires=Tue, 03-Dec-13 20:02:21 GMT;
Link: ; rel="original"
Location: /web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Wayback-Perf: [IndexLoad: 9, IndexQueryTotal: 9, RobotsFetchTotal: 2, RobotsRedis: 2, RobotsTotal: 2, Total: 14]
Set-Cookie: wb_total_perf=14; Expires=Sun, 03-Nov-2013 20:03:21 GMT; Path=/web/20070720040924/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Playback: 0
X-Page-Cache: MISS
Location: /web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi [following]
--2013-11-03 20:02:21-- http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
Reusing existing connection to web.archive.org:80.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: Tengine/1.5.1
Date: Sun, 03 Nov 2013 20:02:21 GMT
Content-Type: video/x-msvideo
Content-Length: 288092160
Connection: keep-alive
Memento-Datetime: Fri, 10 Aug 2007 11:30:28 GMT
Link: ; rel="original", ; rel="timemap"; type="application/link-format", ; rel="timegate", ; rel="first last memento"; datetime="Fri, 10 Aug 2007 11:30:28 GMT"
X-Archive-Orig-Connection: close
X-Archive-Orig-Content-Length: 288092160
X-Archive-Orig-Content-Type: video/x-msvideo
X-Archive-Orig-ETag: "5d5c-112bf000-4015cefd252c0"
X-Archive-Orig-Server: Apache
X-Archive-Orig-Accept-Ranges: bytes
X-Archive-Orig-Last-Modified: Thu, 22 Sep 2005 14:16:19 GMT
X-Archive-Orig-Date: Fri, 10 Aug 2007 11:30:28 GMT
X-Archive-Wayback-Perf: [IndexLoad: 6, IndexQueryTotal: 6, RobotsFetchTotal: 2, RobotsRedis: 2, RobotsTotal: 2, Total: 31, WArcResource: 24]
Set-Cookie: wb_total_perf=31; Expires=Sun, 03-Nov-2013 20:03:21 GMT; Path=/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Playback: 1
X-Page-Cache: MISS
Length: 288092160 (275M) [video/x-msvideo]
Saving to: `Wikimania05-AP1.avi'

36% [====================================> ] 104,857,347 --.-K/s in 1m 45s

2013-11-03 20:04:07 (976 KB/s) - Connection closed at byte 104857347. Retrying.

--2013-11-03 20:04:08-- (try: 2) http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
Connecting to web.archive.org (web.archive.org)|207.241.224.26|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: Tengine/1.5.1
Date: Sun, 03 Nov 2013 20:04:08 GMT
Content-Type: video/x-msvideo
Content-Length: 288092160
Connection: keep-alive
Memento-Datetime: Fri, 10 Aug 2007 11:30:28 GMT
Link: ; rel="original", ; rel="timemap"; type="application/link-format", ; rel="timegate", ; rel="first last memento"; datetime="Fri, 10 Aug 2007 11:30:28 GMT"
X-Archive-Orig-Connection: close
X-Archive-Orig-Content-Length: 288092160
X-Archive-Orig-Content-Type: video/x-msvideo
X-Archive-Orig-ETag: "5d5c-112bf000-4015cefd252c0"
X-Archive-Orig-Server: Apache
X-Archive-Orig-Accept-Ranges: bytes
X-Archive-Orig-Last-Modified: Thu, 22 Sep 2005 14:16:19 GMT
X-Archive-Orig-Date: Fri, 10 Aug 2007 11:30:28 GMT
X-Archive-Wayback-Perf: [IndexLoad: 10, IndexQueryTotal: 10, RobotsFetchTotal: 5, RobotsRedis: 5, RobotsTotal: 5, Total: 87, WArcResource: 74]
Set-Cookie: wb_total_perf=87; Expires=Sun, 03-Nov-2013 20:05:08 GMT; Path=/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Playback: 1
X-Page-Cache: MISS
Length: 288092160 (275M) [video/x-msvideo]
Saving to: `Wikimania05-AP1.avi'

36% [====================================> ] 104,857,347 --.-K/s in 85s

2013-11-03 20:05:33 (1.18 MB/s) - Connection closed at byte 104857347. Retrying.

$ wget --version
GNU Wget 1.13.4 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie +ssl/openssl

Wgetrc:
/etc/wgetrc (system)
Locale: /usr/share/locale
Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
-DLOCALEDIR="/usr/share/locale" -I. -I../../src -I../lib
-I../../lib -D_FORTIFY_SOURCE=2 -Iyes/include -g -O2
-fstack-protector --param=ssp-buffer-size=4 -Wformat
-Wformat-security -Werror=format-security -DNO_SSLv2
-D_FILE_OFFSET_BITS=64 -g -Wall
Link: gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
-Wformat-security -Werror=format-security -DNO_SSLv2
-D_FILE_OFFSET_BITS=64 -g -Wall -Wl,-Bsymbolic-functions
-Wl,-z,relro -Lyes/lib -lssl -lcrypto -lz -ldl -lz -lidn -lrt
ftp-opie.o openssl.o http-ntlm.o ../lib/libgnu.a